1.
7th Arabic Natural Language Processing Workshop, WANLP 2022 held with EMNLP 2022; pages 1-10, 2022.
Article in English | Scopus | ID: covidwho-2290872

ABSTRACT

Named Entity Recognition (NER) is a well-known problem for the natural language processing (NLP) community. It is a key component of different NLP applications, including information extraction, question answering, and information retrieval. In the literature, there are several Arabic NER datasets with different named entity tags; however, due to data and concept drift, we are always in need of new data for NER and other NLP applications. In this paper, first, we introduce Wassem, a web-based annotation platform for Arabic NLP applications. Wassem can be used to manually annotate textual data for a variety of NLP tasks: text classification, sequence classification, and word segmentation. Second, we introduce the COVID-19 Arabic Named Entities Recognition (CAraNER) dataset extracted from the Arabic Newspaper COVID-19 Corpus (AraNPCC). CAraNER has 55,389 tokens distributed over 1,278 sentences randomly extracted from Saudi Arabian newspaper articles published during 2019, 2020, and 2021. The dataset is labeled by five annotators with five named-entity tags, namely: Person, Title, Location, Organization, and Miscellaneous. The CAraNER corpus is freely available for download. We evaluate the corpus by fine-tuning four BERT-based Arabic language models on it. The best model was AraBERTv0.2-large, with a macro F1 score of 0.86. © 2022 Association for Computational Linguistics.
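
For readers unfamiliar with the macro F1 measure reported above, the following is a minimal sketch of how it can be computed: F1 is calculated separately for each tag and the per-tag scores are averaged with equal weight. The function name `macro_f1` and the toy labels are illustrative assumptions, and the sketch scores individual tokens; the abstract does not say whether the authors scored tokens or whole entity spans.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 scores (token-level; an assumption)."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three of the five CAraNER tags (hypothetical data).
gold = ["Person", "Location", "Person", "Organization"]
pred = ["Person", "Location", "Location", "Organization"]
score = macro_f1(gold, pred)
```

Because every tag contributes equally to the average, a rare tag such as Title affects the score as much as a frequent one, which is why macro F1 is commonly reported for imbalanced tag sets.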

2.
Journal of Beijing Institute of Technology (English Edition) ; 31(3):285-292, 2022.
Article in English | Scopus | ID: covidwho-1924761

ABSTRACT

Single-cell RNA-sequencing (scRNA-seq) is a rapidly growing research area in biomedical signal processing. However, the high complexity of single-cell data makes efficient and accurate analysis difficult. To improve the performance of single-cell RNA data processing, two single-cell feature calculation methods and corresponding dual-input neural network structures are proposed. In this feature extraction and fusion scheme, the features at the cluster level are extracted by hierarchical clustering and differential gene analysis, and the features at the cell level are extracted by calculating gene frequency and cross-cell frequency. Our experiments on COVID-19 data demonstrate that the combined use of these two features achieves strong results and high robustness in classification tasks. © 2021 Journal of Beijing Institute of Technology
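
The abstract does not define gene frequency or cross-cell frequency. One possible reading, by analogy with TF-IDF weighting in text retrieval, is sketched below: gene frequency is a gene's share of the counts within one cell, and cross-cell frequency is the log of the number of cells divided by the number of cells expressing that gene. Both definitions, the function name `cell_level_features`, and the toy count matrix are assumptions made for illustration, not the authors' method.

```python
import math


def cell_level_features(counts):
    """TF-IDF-style weighting of a cells x genes count matrix (assumed reading)."""
    n_cells = len(counts)
    n_genes = len(counts[0])
    # Number of cells in which each gene has a non-zero count.
    cells_expressing = [
        sum(1 for cell in counts if cell[g] > 0) for g in range(n_genes)
    ]
    features = []
    for cell in counts:
        total = sum(cell)
        row = []
        for g in range(n_genes):
            # Assumed "gene frequency": share of this cell's counts.
            gene_freq = cell[g] / total if total else 0.0
            # Assumed "cross-cell frequency": log inverse of expression breadth.
            if cells_expressing[g]:
                cross_cell = math.log(n_cells / cells_expressing[g])
            else:
                cross_cell = 0.0
            row.append(gene_freq * cross_cell)
        features.append(row)
    return features


# Hypothetical 2 cells x 3 genes count matrix.
toy_counts = [[2, 0, 2], [0, 4, 4]]
toy_features = cell_level_features(toy_counts)
```

Under this reading, a gene expressed in every cell (the third gene in the toy matrix) receives zero weight, while genes specific to a few cells are emphasized.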
